(以下文章擷取自筆者的 blog,分享給大家)
上篇筆者說明了一些原理說明,這篇就來做個實驗。
有一個不錯用的工具 mysql_install_db.exe 可以直接開新的 Instance。
筆者用這個先開一個 Instance 在 Port 3307。
設定原來在 3306 的為 Master (server-id=1
),3307 的為 Slave (server-id=2
),都開啟 binlog (log-bin
)。
Event 記得設成 Disabled on Slave。
接下來備份 Master
E:\backup>mysqldump -u sujunmin -p --master-data --all-databases --events --routines --gtid > all_db.sql
Enter password: ************
其中幾個重要參數
參數 | 意義 |
---|---|
master-data |
dump 出來的掛 master ID |
events |
包含 events |
routines |
包含 functions 與 store procedures |
gtid |
產生 CHANGE MASTER TO master_use_gtid 語法 |
觀察一下 all_db.sql
的內容
-- MySQL dump 10.16 Distrib 10.1.17-MariaDB, for Win64 (AMD64)
(中間省略)
--
-- Position to start replication or point-in-time recovery from
--
-- CHANGE MASTER TO MASTER_LOG_FILE='WSTest-bin.000002', MASTER_LOG_POS=3732;
--
-- GTID to start replication from
--
CHANGE MASTER TO MASTER_USE_GTID=slave_pos;
SET GLOBAL gtid_slave_pos='0-1-770';
--
-- Current Database: 'master'
--
(後面省略)
可以看到如果要用舊的方式(兼容 MySQL Replication) 與新的方式。
啟動 Slave 的 Instance。
把 all_db.sql
倒到 Slave (3307) 上面。
在 Master 上開一個 User lsuser
作為 Replication 用,權限是 REPLICATION SLAVE
與 SUPER
(這個我不設定會沒權限登入,但是官方網站沒有這個權限)。
在 Slave 上設定 CHANGE MASTER
MariaDB [(none)]> change master to master_host='localhost', master_port=3306, master_user='lsuser', master_password='password';
Query OK, 0 rows affected (0.05 sec)
START SLAVE;
SHOW SLAVE STATUS;
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: localhost
Master_User: lsuser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: WSTest-bin.000002
Read_Master_Log_Pos: 5429
Relay_Log_File: WSTest-relay-bin.000002
Relay_Log_Pos: 5757
Relay_Master_Log_File: WSTest-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
(省略)
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 0-1-775
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: conservative
1 row in set (0.00 sec)
看一下同步狀況
移除 Master 模擬 Master 壞掉了狀態
sc stop mariadb
sc delete mariadb
Slave 沒法連到 Master 了
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Reconnecting after a failed master event read
Master_Host: localhost
Master_User: lsuser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: WSTest-bin.000002
Read_Master_Log_Pos: 8223
Relay_Log_File: WSTest-relay-bin.000002
Relay_Log_Pos: 8551
Relay_Master_Log_File: WSTest-bin.000002
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
(省略)
Last_IO_Errno: 2003
Last_IO_Error: error reconnecting to master 'lsuser@localhost:3306' - retry-time: 60 retries: 86400 message: Can't connect to MySQL server on'localhost' (10061 "Unknown error")
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 0-1-784
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: conservative
1 row in set (0.00 sec)
STOP SLAVE
打開 Slave 的 Event 服務
繼續服務
重建 Master
E:\backup>mysqldump -u sujunmin -p --port 3307 --master-data --all-databases --events --routines --gtid > all_db.sql
Enter password: ************
觀察一下 all_db.sql
的內容
-- MySQL dump 10.16 Distrib 10.1.17-MariaDB, for Win64 (AMD64)
(中間省略)
--
-- Position to start replication or point-in-time recovery from
--
-- CHANGE MASTER TO MASTER_LOG_FILE='WSTest-bin.000003', MASTER_LOG_POS=12441;
--
-- GTID to start replication from
--
CHANGE MASTER TO MASTER_USE_GTID=slave_pos;
SET GLOBAL gtid_slave_pos='0-2-976';
--
-- Current Database: 'master'
--
(後面省略)
啟動 Master 的 Instance。
把 all_db.sql
倒到 Master (3307) 上面。
在 Slave 上設定 CHANGE MASTER
為 current_pos
MariaDB [(none)]> change master to master_host='localhost', master_port=3306, master_user='lsuser', master_password='password', master_use_gtid=current_pos;
Query OK, 0 rows affected (0.05 sec)
START SLAVE;
SHOW SLAVE STATUS;
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
(省略)
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Error: connecting slave requested to start from GTID 0-2-980, which is not in the master's binlog. Since the master's binlog contains GTIDs with higher sequence numbers, it probably means that the slave has diverged due to executing extra erroneous transactions'
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Current_Pos
Gtid_IO_Pos: 0-2-980
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: conservative
1 row in set (0.00 sec)
STOP SLAVE;
在 Slave 上設定 CHANGE MASTER
為 slave_pos
MariaDB [(none)]> change master to master_host='localhost', master_port=3306, master_user='lsuser', master_password='password', master_use_gtid=slave_pos;
Query OK, 0 rows affected (0.05 sec)
START SLAVE;
SHOW SLAVE STATUS;
MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: localhost
Master_User: lsuser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: WSTest-bin.000003
Read_Master_Log_Pos: 7177
Relay_Log_File: WSTest-relay-bin.000003
Relay_Log_Pos: 7466
Relay_Master_Log_File: WSTest-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
(省略)
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 0-1-1167
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: conservative
1 row in set (0.00 sec)
0-2-976
, 0-1-1167
) 都幫你做好了CREATE MASTER to ..., master_use_gtid=slave_pos;
),因為 slave_pos
可以是多個的,例如 gtid_slave_pos='0-1-123,1-2-456,...'
stop slave;
set global sql_slave_skip_counter=1;
start slave;
select sleep(10);
show slave status;
雖然有人說不要太常用 sql_slave_skip_counter
,但這個是我覺得重建外最快的方法了,到現在還沒有碰到其他問題。